Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 937.6 KiB
Average record size in memory: 96.0 B
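
As a rough cross-check of the memory figures above, the average record size should be about the total size divided by the number of observations. The sketch below redoes that arithmetic with the rounded values from this report. The per-value figure of about 8 bytes is only consistent with 64-bit numeric columns; that is an inference, not something the report states.

```python
# Rounded figures copied from the dataset statistics above.
total_kib = 937.6      # total size in memory, KiB
n_obs = 10000          # number of observations
n_vars = 12            # number of variables

# Average bytes per row: total bytes divided by the number of rows.
avg_record_bytes = total_kib * 1024 / n_obs

# Average bytes per stored value (assumption: every column has the same width).
bytes_per_value = avg_record_bytes / n_vars

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"bytes per value:     {bytes_per_value:.2f}")
```

The small excess over exactly 96 bytes per row would be expected from fixed overhead such as the index, though the report does not break that down.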

Variable types

NUM: 10
CAT: 2

Reproduction

Analysis started: 2020-03-04 15:22:41.767835
Analysis finished: 2020-03-04 15:24:14.997139
Version: pandas-profiling v2.5.3
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Percent_Of_Forces_Mobilized has 898 (9.0%) zeros

Variables

Allied_Nations
Real number (ℝ≥0)

Distinct count: 12
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.3178
Minimum: 5
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 5
5th percentile: 6
Q1: 7
Median: 8
Q3: 9
95th percentile: 12
Maximum: 16
Range: 11
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.748802159
Coefficient of variation (CV): 0.2102481616
Kurtosis: 1.0669834
Mean: 8.3178
Mean absolute deviation (MAD): 1.36061364
Skewness: 0.9151319924
Sum: 83178
Variance: 3.058308991
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 5. 5.5 6.5 7.5 8.5 ... 12.5 13.5 14.5 15.5 16. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
8, 2820, 28.2%
7, 2498, 25.0%
9, 1447, 14.5%
10, 1052, 10.5%
6, 872, 8.7%
12, 475, 4.8%
11, 451, 4.5%
5, 179, 1.8%
13, 144, 1.4%
14, 28, 0.3%
Other values (2), 34, 0.3%

Minimum 5 values

Value, Count, Frequency (%)
5, 179, 1.8%
6, 872, 8.7%
7, 2498, 25.0%
8, 2820, 28.2%
9, 1447, 14.5%

Maximum 5 values

Value, Count, Frequency (%)
16, 27, 0.3%
15, 7, 0.1%
14, 28, 0.3%
13, 144, 1.4%
12, 475, 4.8%
 
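
The descriptive statistics for Allied_Nations can be recomputed from its value counts, which helps pin down which definitions the report uses. The sketch below is a minimal pure-Python check. The counts are copied from the frequency tables above (the two "other" values are 15 and 16, taken from the largest-values table). Under those inputs, the variance matches a sample variance with n - 1 in the denominator, and the figure the report lists as MAD (1.36061364) matches the mean absolute deviation about the mean.

```python
from math import sqrt

# Value -> count for Allied_Nations, copied from the frequency tables above.
counts = {5: 179, 6: 872, 7: 2498, 8: 2820, 9: 1447, 10: 1052,
          11: 451, 12: 475, 13: 144, 14: 28, 15: 7, 16: 27}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # the "Sum" statistic
mean = total / n

# Sample variance: squared deviations divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Mean absolute deviation about the mean.
mean_abs_dev = sum(c * abs(v - mean) for v, c in counts.items()) / n

print(n, total, mean)
print(round(variance, 9), round(std, 9), round(cv, 10))
print(round(mean_abs_dev, 8))
```

The same check works for any variable whose full frequency table is recoverable from the report, such as DEFCON_Level.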
Diplomatic_Meetings_Set
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Common values

Value, Count, Frequency (%)
1, 5233, 52.3%
0, 4763, 47.6%
2, 4, < 0.1%

Length

Max length: 1
Mean length: 1
Min length: 1

Unicode categories

Value, Count, Frequency (%)
Decimal_Number, 3, 100.0%

Unicode scripts

Value, Count, Frequency (%)
Common, 3, 100.0%

Unicode blocks

Value, Count, Frequency (%)
ASCII, 3, 100.0%

Percent_Of_Forces_Mobilized
Real number (ℝ≥0)

ZEROS
Distinct count: 80
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.27054800000000007
Minimum: 0.0
Maximum: 1.0
Zeros: 898
Zeros (%): 9.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.09
Median: 0.26
Q3: 0.43
95th percentile: 0.6
Maximum: 1
Range: 1
Interquartile range (IQR): 0.34

Descriptive statistics

Standard deviation: 0.1964336486
Coefficient of variation (CV): 0.7260584023
Kurtosis: -0.8403427291
Mean: 0.270548
Mean absolute deviation (MAD): 0.166993072
Skewness: 0.3102497686
Sum: 2705.48
Variance: 0.03858617831
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.005 0.015 0.025 0.065 ... 0.665 0.675 0.685 0.785 1. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
0, 898, 9.0%
0.49, 416, 4.2%
0.02, 297, 3.0%
0.24, 289, 2.9%
0.08, 252, 2.5%
0.26, 216, 2.2%
0.1, 212, 2.1%
0.42, 204, 2.0%
0.21, 196, 2.0%
0.31, 195, 1.9%
Other values (70), 6825, 68.2%

Minimum 5 values

Value, Count, Frequency (%)
0, 898, 9.0%
0.01, 181, 1.8%
0.02, 297, 3.0%
0.03, 189, 1.9%
0.04, 155, 1.6%

Maximum 5 values

Value, Count, Frequency (%)
1, 6, 0.1%
0.79, 7, 0.1%
0.78, 9, 0.1%
0.76, 16, 0.2%
0.75, 6, 0.1%

Hostile_Nations
Real number (ℝ≥0)

Distinct count: 14
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5023
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 2
Median: 2
Q3: 3
95th percentile: 5
Maximum: 16
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.311780007
Coefficient of variation (CV): 0.5242297115
Kurtosis: 27.7949184
Mean: 2.5023
Mean absolute deviation (MAD): 0.78611936
Skewness: 4.352497103
Sum: 25023
Variance: 1.720766787
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 2.5 3.5 4.5 5.5 6.5 8.5 10. 16. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
2, 6910, 69.1%
3, 1807, 18.1%
4, 432, 4.3%
1, 306, 3.1%
6, 203, 2.0%
5, 141, 1.4%
8, 73, 0.7%
7, 54, 0.5%
9, 30, 0.3%
14, 11, 0.1%
Other values (4), 33, 0.3%

Minimum 5 values

Value, Count, Frequency (%)
1, 306, 3.1%
2, 6910, 69.1%
3, 1807, 18.1%
4, 432, 4.3%
5, 141, 1.4%

Maximum 5 values

Value, Count, Frequency (%)
16, 6, 0.1%
15, 7, 0.1%
14, 11, 0.1%
13, 9, 0.1%
11, 11, 0.1%

Active_Threats
Real number (ℝ≥0)

Distinct count: 60
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.7543
Minimum: 1.0
Maximum: 72.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 7
Median: 13
Q3: 21
95th percentile: 35
Maximum: 72
Range: 71
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 10.38890632
Coefficient of variation (CV): 0.6594330637
Kurtosis: 1.895651554
Mean: 15.7543
Mean absolute deviation (MAD): 8.16312682
Skewness: 1.232348629
Sum: 157543
Variance: 107.9293744
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 2.5 4.5 5.25 5.75 ... 42.5 50.5 51.5 54.5 72. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
6, 885, 8.8%
5, 684, 6.8%
15, 470, 4.7%
12, 465, 4.7%
10, 454, 4.5%
7, 447, 4.5%
9, 441, 4.4%
17, 377, 3.8%
11, 368, 3.7%
16, 367, 3.7%
Other values (50), 5042, 50.4%

Minimum 5 values

Value, Count, Frequency (%)
1, 13, 0.1%
2, 10, 0.1%
3, 291, 2.9%
4, 279, 2.8%
5, 684, 6.8%

Maximum 5 values

Value, Count, Frequency (%)
72, 9, 0.1%
68, 3, < 0.1%
66, 6, 0.1%
57, 11, 0.1%
55, 5, 0.1%

Inactive_Threats
Real number (ℝ≥0)

Distinct count: 144
Unique (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.5346
Minimum: 6.0
Maximum: 289.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 6
5th percentile: 11
Q1: 21
Median: 37
Q3: 62
95th percentile: 116
Maximum: 289
Range: 283
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 34.01264895
Coefficient of variation (CV): 0.7309109556
Kurtosis: 4.807620511
Mean: 46.5346
Mean absolute deviation (MAD): 25.92556424
Skewness: 1.669571947
Sum: 465346
Variance: 1156.860289
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 6. 7.5 9.5 27.5 28.5 ... 140.5 152.5 162.5 283.5 289. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
28, 273, 2.7%
24, 239, 2.4%
14, 229, 2.3%
12, 214, 2.1%
20, 212, 2.1%
19, 212, 2.1%
23, 200, 2.0%
15, 198, 2.0%
13, 196, 2.0%
21, 191, 1.9%
Other values (134), 7836, 78.4%

Minimum 5 values

Value, Count, Frequency (%)
6, 8, 0.1%
7, 36, 0.4%
8, 75, 0.8%
9, 111, 1.1%
10, 156, 1.6%

Maximum 5 values

Value, Count, Frequency (%)
289, 7, 0.1%
278, 13, 0.1%
165, 7, 0.1%
160, 9, 0.1%
155, 5, 0.1%

Citizen_Fear_Index
Real number (ℝ≥0)

Distinct count: 436
Unique (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.48663898678413764
Minimum: 0.0
Maximum: 1.0
Zeros: 7
Zeros (%): 0.1%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0.2577092511
Q1: 0.406020558
Median: 0.486784141
Q3: 0.5675477239
95th percentile: 0.7143906021
Maximum: 1
Range: 1
Interquartile range (IQR): 0.1615271659

Descriptive statistics

Standard deviation: 0.1345518192
Coefficient of variation (CV): 0.2764920667
Kurtosis: 0.8368357676
Mean: 0.4866389868
Mean absolute deviation (MAD): 0.102734544
Skewness: 0.05851077877
Sum: 4866.389868
Variance: 0.01810419204
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.10646109 0.11674009 0.16960352 0.17180617 ... 0.74192364 0.74522761 0.7804699 0.83553598 1. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
0.4941262849, 260, 2.6%
0.5528634361, 236, 2.4%
0.5234948605, 214, 2.1%
0.5822320117, 203, 2.0%
0.5675477239, 183, 1.8%
0.5088105727, 177, 1.8%
0.4500734214, 172, 1.7%
0.5969162996, 153, 1.5%
0.4794419971, 146, 1.5%
0.4647577093, 145, 1.5%
Other values (426), 8111, 81.1%

Minimum 5 values

Value, Count, Frequency (%)
0, 7, 0.1%
0.009544787078, 6, 0.1%
0.04185022026, 5, 0.1%
0.05359765051, 8, 0.1%
0.05653450808, 6, 0.1%

Maximum 5 values

Value, Count, Frequency (%)
1, 5, 0.1%
0.9640234949, 9, 0.1%
0.9603524229, 15, 0.1%
0.9412628488, 6, 0.1%
0.9199706314, 16, 0.2%

Closest_Threat_Distance(km)
Real number (ℝ≥0)

Distinct count: 89
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 350.88575799999995
Minimum: 290.44
Maximum: 425.06
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 290.44
5th percentile: 324.36
Q1: 340.26
Median: 350.86
Q3: 360.4
95th percentile: 378.42
Maximum: 425.06
Range: 134.62
Interquartile range (IQR): 20.14

Descriptive statistics

Standard deviation: 16.28574405
Coefficient of variation (CV): 0.04641323758
Kurtosis: 0.7512898566
Mean: 350.885758
Mean absolute deviation (MAD): 12.6586391
Skewness: 0.2317032194
Sum: 3508857.58
Variance: 265.2254592
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[290.44 303.69 311.11 312.17 314.29 ... 388.49 390.61 393.79 399.09 425.06], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
349.8, 341, 3.4%
345.56, 338, 3.4%
354.04, 336, 3.4%
351.92, 329, 3.3%
358.28, 325, 3.2%
356.16, 296, 3.0%
359.34, 273, 2.7%
352.98, 260, 2.6%
350.86, 258, 2.6%
347.68, 246, 2.5%
Other values (79), 6998, 70.0%

Minimum 5 values

Value, Count, Frequency (%)
290.44, 3, < 0.1%
303.16, 6, 0.1%
304.22, 8, 0.1%
305.28, 14, 0.1%
306.34, 12, 0.1%

Maximum 5 values

Value, Count, Frequency (%)
425.06, 14, 0.1%
413.4, 14, 0.1%
408.1, 5, 0.1%
400.68, 9, 0.1%
397.5, 10, 0.1%
 
Aircraft_Carriers_Responding
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Common values

Value, Count, Frequency (%)
1, 8799, 88.0%
0, 1144, 11.4%
2, 57, 0.6%

Length

Max length: 1
Mean length: 1
Min length: 1

Unicode categories

Value, Count, Frequency (%)
Decimal_Number, 3, 100.0%

Unicode scripts

Value, Count, Frequency (%)
Common, 3, 100.0%

Unicode blocks

Value, Count, Frequency (%)
ASCII, 3, 100.0%

Troops_Mobilized(thousands)
Real number (ℝ≥0)

Distinct count: 65
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 917487.4133333334
Minimum: 739200.0
Maximum: 1311200.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 739200
5th percentile: 809600
Q1: 836000
Median: 897600
Q3: 976800
95th percentile: 1100000
Maximum: 1311200
Range: 572000
Interquartile range (IQR): 140800

Descriptive statistics

Standard deviation: 92717.64016
Coefficient of variation (CV): 0.101056035
Kurtosis: 0.07689012417
Mean: 917487.4133
Mean absolute deviation (MAD): 76855.73773
Skewness: 0.8070937636
Sum: 9174874133
Variance: 8596560797
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 739200. 743600. 783200. 794200. 798600. ... 1122000. 1139600. 1214400. 1271600. 1311200.], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
836000, 819, 8.2%
827200, 628, 6.3%
880000, 470, 4.7%
809600, 464, 4.6%
862400, 456, 4.6%
818400, 399, 4.0%
924000, 390, 3.9%
844800, 361, 3.6%
853600, 346, 3.5%
968000, 342, 3.4%
Other values (55), 5325, 53.2%

Minimum 5 values

Value, Count, Frequency (%)
739200, 17, 0.2%
748000, 4, < 0.1%
765600, 15, 0.1%
774400, 8, 0.1%
792000, 155, 1.6%

Maximum 5 values

Value, Count, Frequency (%)
1311200, 6, 0.1%
1232000, 32, 0.3%
1196800, 29, 0.3%
1193866.667, 6, 0.1%
1188000, 6, 0.1%

DEFCON_Level
Real number (ℝ≥0)

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6166
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 2
Median: 3
Q3: 3
95th percentile: 4
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8027881955
Coefficient of variation (CV): 0.3068058532
Kurtosis: -0.08759390366
Mean: 2.6166
Mean absolute deviation (MAD): 0.68758976
Skewness: 0.350630683
Sum: 26166
Variance: 0.6444688869
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
2, 4270, 42.7%
3, 3910, 39.1%
4, 1212, 12.1%
1, 498, 5.0%
5, 110, 1.1%

Minimum 5 values

Value, Count, Frequency (%)
1, 498, 5.0%
2, 4270, 42.7%
3, 3910, 39.1%
4, 1212, 12.1%
5, 110, 1.1%

Maximum 5 values

Value, Count, Frequency (%)
5, 110, 1.1%
4, 1212, 12.1%
3, 3910, 39.1%
2, 4270, 42.7%
1, 498, 5.0%

ID
Real number (ℝ≥0)

UNIFORM
UNIQUE
Distinct count: 10000
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6266.5542
Minimum: 2
Maximum: 12500
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 2
5th percentile: 629.95
Q1: 3139.75
Median: 6280.5
Q3: 9391.5
95th percentile: 11892.05
Maximum: 12500
Range: 12498
Interquartile range (IQR): 6251.75

Descriptive statistics

Standard deviation: 3610.170288
Coefficient of variation (CV): 0.5761013426
Kurtosis: -1.198599895
Mean: 6266.5542
Mean absolute deviation (MAD): 3126.53067
Skewness: -0.006091181042
Sum: 62665542
Variance: 13033329.51
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[2.00e+00 1.25e+04], "bayesian blocks" binning strategy used)
Common values

Value, Count, Frequency (%)
2047, 1, < 0.1%
3427, 1, < 0.1%
9582, 1, < 0.1%
3435, 1, < 0.1%
1386, 1, < 0.1%
7529, 1, < 0.1%
5480, 1, < 0.1%
11623, 1, < 0.1%
9574, 1, < 0.1%
7521, 1, < 0.1%
Other values (9990), 9990, 99.9%

Minimum 5 values

Value, Count, Frequency (%)
2, 1, < 0.1%
3, 1, < 0.1%
4, 1, < 0.1%
5, 1, < 0.1%
6, 1, < 0.1%

Maximum 5 values

Value, Count, Frequency (%)
12500, 1, < 0.1%
12499, 1, < 0.1%
12498, 1, < 0.1%
12497, 1, < 0.1%
12496, 1, < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the angle of the line to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
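
The definition above can be sketched in a few lines of pure Python. This is an illustrative implementation, not the code the report generator runs; the function name `pearson_r` and the sample data are hypothetical. It also checks the invariance under location and positive scale changes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)

# Hypothetical sample data, for illustration only.
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [2.0, 1.0, 5.0, 6.0, 12.0]

r_original = pearson_r(xs, ys)

# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([3 * x + 1 for x in xs], [0.5 * y - 2 for y in ys])

print(r_original, r_transformed)  # the two values agree up to rounding error
```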

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
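
A minimal sketch of that recipe: rank each variable, then apply the Pearson formula to the ranks. The helper names and data are hypothetical, and giving tied values the average of their positions is a common convention that is assumed here rather than taken from the report.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)  # rho reaches 1 for a monotonic relation; r stays below 1
```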

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
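
The pair-counting definition can be sketched directly. This follows the formula exactly as worded above, which is often called tau-a. Statistical libraries frequently use a tie-corrected variant instead, so values from the report's correlation matrix may differ when the data contain ties. The function name and data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite ways
            discordant += 1
        # direction == 0 means a tie in x or y; it counts for neither side
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six possible pairs.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

tau = kendall_tau_a(xs, ys)
print(tau)  # 5 concordant pairs, 1 discordant pair, 6 pairs in total
```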

Missing values

Sample

First rows

(index), Allied_Nations, Diplomatic_Meetings_Set, Percent_Of_Forces_Mobilized, Hostile_Nations, Active_Threats, Inactive_Threats, Citizen_Fear_Index, Closest_Threat_Distance(km), Aircraft_Carriers_Responding, Troops_Mobilized(thousands), DEFCON_Level, ID
0, 16, 1, 0.67, 3, 6.0, 25.0, 0.787812, 324.36, 1, 862400.0, 3, 7570
1, 8, 1, 0.11, 2, 38.0, 48.0, 0.419236, 359.34, 0, 959200.0, 4, 12128
2, 9, 1, 0.49, 3, 32.0, 69.0, 0.582232, 332.84, 1, 836000.0, 3, 2181
3, 7, 0, 0.30, 2, 31.0, 52.0, 0.589574, 358.28, 1, 924000.0, 3, 5946
4, 8, 1, 0.12, 5, 13.0, 42.0, 0.552863, 364.64, 0, 968000.0, 2, 9054
5, 6, 1, 0.10, 2, 4.0, 11.0, 0.508811, 387.96, 1, 880000.0, 2, 10947
6, 7, 1, 0.08, 2, 42.0, 52.0, 0.357562, 349.80, 1, 1038400.0, 4, 4717
7, 8, 1, 0.10, 2, 34.0, 45.0, 0.378120, 348.74, 1, 1038400.0, 4, 8008
8, 10, 0, 0.65, 7, 18.0, 39.0, 0.656388, 333.90, 1, 959200.0, 2, 8179
9, 8, 0, 0.29, 3, 18.0, 58.0, 0.618943, 349.80, 1, 800800.0, 2, 6324

Last rows

(index), Allied_Nations, Diplomatic_Meetings_Set, Percent_Of_Forces_Mobilized, Hostile_Nations, Active_Threats, Inactive_Threats, Citizen_Fear_Index, Closest_Threat_Distance(km), Aircraft_Carriers_Responding, Troops_Mobilized(thousands), DEFCON_Level, ID
9990, 7, 1, 0.07, 3, 6.0, 18.0, 0.450073, 364.64, 1, 950400.0, 3, 193
9991, 7, 0, 0.22, 2, 21.0, 51.0, 0.420705, 354.04, 1, 818400.0, 2, 12323
9992, 13, 1, 0.45, 6, 19.0, 64.0, 0.964023, 349.80, 1, 950400.0, 3, 9869
9993, 10, 1, 0.31, 2, 30.0, 92.0, 0.802496, 341.32, 1, 915200.0, 3, 5842
9994, 9, 0, 0.30, 2, 20.0, 72.0, 0.501468, 332.84, 1, 862400.0, 2, 11039
9995, 11, 1, 0.49, 5, 5.0, 13.0, 0.919971, 329.66, 1, 1029600.0, 4, 11493
9996, 10, 0, 0.52, 2, 14.0, 28.0, 0.433921, 348.74, 1, 1047200.0, 3, 305
9997, 7, 1, 0.13, 2, 25.0, 42.0, 0.470631, 366.76, 1, 941600.0, 3, 612
9998, 10, 1, 0.42, 2, 21.0, 84.0, 0.662996, 348.74, 1, 836000.0, 2, 4963
9999, 8, 1, 0.24, 2, 15.0, 105.0, 0.423642, 355.10, 1, 862400.0, 2, 9387